{ "cells": [ { "cell_type": "markdown", "id": "55faa28d-5e5a-4c11-9433-bf1731ccd5e6", "metadata": {}, "source": [ "# Manipulating QMzymeRegion\n", "## Objective\n", "\n", "This tutorial shows how a QMzymeRegion can be modified to obtain the desired selection for QM input generation. This workflow allows you to:\n", "\n", "- Learn methods to combine and subtract QMzymeRegion objects.\n", "\n", "In this example, we use ketosteroid isomerase (KSI) as the model system. The KSI structure was obtained from PDB entry [1OH0](https://doi.org/10.2210/pdb1OH0/pdb) and MM-minimized prior to this tutorial.\n", "\n", "## Classes used in this example\n", "\n", "- [GenerateModel](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.GenerateModel.html)\n", "- [QM_method](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.CalculateModel.html#qm-treatment)\n", "- [SelectionSchemes](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.SelectionSchemes.html#)\n", "- [DistanceCutoff SelectionSchemes](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.SelectionSchemes.html#QMzyme.SelectionSchemes.DistanceCutoff)\n", "- [QMzymeRegion](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.QMzymeRegion.html)\n", "  - [subtract method](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.QMzymeRegion.html#QMzyme.QMzymeRegion.QMzymeRegion.subtract)\n", "  - [combine method](https://qmzyme.readthedocs.io/en/latest/API/QMzyme.QMzymeRegion.html#QMzyme.QMzymeRegion.QMzymeRegion.combine)\n", "\n", "## Required Files\n", "To start, you will need:\n", "\n", "- A fully prepared and protonated PDB file\n", " \n", "---" ] }, { "cell_type": "code", "execution_count": 8, "id": "85333d06-7e1d-41d1-a269-2faa99086a0c", "metadata": {}, "outputs": [], "source": [ "# Here are the necessary imports for this tutorial!\n", "\n", "import QMzyme\n", "from QMzyme import GenerateModel\n", "from QMzyme.SelectionSchemes import DistanceCutoff\n", "from QMzyme.data import PDB\n", "from QMzyme.RegionBuilder import RegionBuilder\n", "import pandas as pd\n", "import MDAnalysis" ] }, { "cell_type": "markdown", "id": "aac2d434-93c6-4369-8c66-7b0eaf225da1", "metadata": {}, "source": [ "## Combining Two Regions\n", "\n", "We'll first look at combining two regions. To do this, we use the `combine()` method of the QMzymeRegion class: choose a base region and a region to add, then call `combine()` on the base region with the region to add as its argument. Here, we use it to add Tyr 57 to the region selected with a 3 Å distance cutoff." ] }, { "cell_type": "code", "execution_count": 11, "id": "3d8b1279-ad21-4282-87ee-07f966a23ac0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Charge information not present. QMzyme will try to guess region charges based on residue names consistent with AMBER naming conventions (i.e., aspartate: ASP --> Charge: -1, aspartic acid: ASH --> Charge: 0.). See QMzyme.data.residue_charges for the full set.\n", "\n", "\tNonconventional Residues Found\n", "\t------------------------------\n", "\tEQU --> Charge: UNK, defaulting to 0\n", "\n", "You can update charge information for nonconventional residues by running \n", "\t>>>QMzyme.data.residue_charges.update({'3LETTER_RESNAME':INTEGER_CHARGE}). \n", "Note your changes will not be stored after you exit your session. It is recommended to only alter the residue_charges dictionary. 
If you alter the protein_residues dictionary instead that could cause unintended bugs in other modules (TruncationSchemes).\n", "\n" ] } ], "source": [ "# Initialize the model, then set the charge of the nonconventional residue EQU.\n", "model = QMzyme.GenerateModel(PDB)\n", "QMzyme.data.residue_charges.update({'EQU': -1})\n", "\n", "# Define the regions of interest.\n", "model.set_catalytic_center(selection='resname EQU and segid A')\n", "model.set_region(selection=DistanceCutoff, cutoff=3)\n", "model.set_region(selection=\"resid 57\", name=\"Tyr_57\")\n", "\n", "# Add the Tyr_57 region to the cutoff_3 region and store the result in the model.\n", "combined_region = model.get_region(\"cutoff_3\")\n", "combined_region = combined_region.combine(model.Tyr_57)\n", "model.set_region(selection=combined_region, name=\"combined_region\")" ] }, { "cell_type": "markdown", "id": "c0b06f9c-3fb6-4da3-aa64-f3b2c40fc26f", "metadata": {}, "source": [ "We can inspect a region by passing the output of its `summarize()` method to a pandas DataFrame. We'll first look at the cutoff_3 region, then compare it with combined_region." ] }, { "cell_type": "code", "execution_count": 12, "id": "0c710afb-5f08-43ea-bd9a-ce73f1e89947", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Resid Resname Charge Removed atoms Fixed atoms Segids\n", "0 16 TYR 0 [] [] A\n", "1 20 VAL 0 [] [] A\n", "2 40 ASP -1 [] [] A\n", "3 60 GLY 0 [] [] A\n", "4 61 LEU 0 [] [] A\n", "5 66 VAL 0 [] [] A\n", "6 86 PHE 0 [] [] A\n", "7 88 VAL 0 [] [] A\n", "8 90 MET 0 [] [] A\n", "9 99 LEU 0 [] [] A\n", "10 101 VAL 0 [] [] A\n", "11 103 ASH 0 [] [] A\n", "12 118 ALA 0 [] [] A\n", "13 120 TRP 0 [] [] A\n", "14 263 EQU -1 [] [] A\n", "15 372 WAT 0 [] [] A\n", "16 373 WAT 0 [] [] A\n", "17 376 WAT 0 [] [] A\n", "18 378 WAT 0 [] [] A" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(model.cutoff_3.summarize())\n", "df" ] }, { "cell_type": "code", "execution_count": 13, "id": "a0767665-8a2e-4fa5-b0d6-a55a113e7864", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Resid Resname Charge Removed atoms Fixed atoms Segids\n", "0 16 TYR 0 [] [] A\n", "1 20 VAL 0 [] [] A\n", "2 40 ASP -1 [] [] A\n", "3 57 TYR 0 [] [] A\n", "4 60 GLY 0 [] [] A\n", "5 61 LEU 0 [] [] A\n", "6 66 VAL 0 [] [] A\n", "7 86 PHE 0 [] [] A\n", "8 88 VAL 0 [] [] A\n", "9 90 MET 0 [] [] A\n", "10 99 LEU 0 [] [] A\n", "11 101 VAL 0 [] [] A\n", "12 103 ASH 0 [] [] A\n", "13 118 ALA 0 [] [] A\n", "14 120 TRP 0 [] [] A\n", "15 263 EQU -1 [] [] A\n", "16 372 WAT 0 [] [] A\n", "17 373 WAT 0 [] [] A\n", "18 376 WAT 0 [] [] A\n", "19 378 WAT 0 [] [] A" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(model.combined_region.summarize())\n", "df" ] }, { "cell_type": "markdown", "id": "67c65ac4-9a55-4919-89aa-96ad6f29ee80", "metadata": {}, "source": [ "As you can see, Tyr 57 now appears in combined_region (row 3), which has 20 residues instead of the 19 in cutoff_3. The two regions were combined successfully!" ] }, { "cell_type": "markdown", "id": "161c6426-11dd-44fa-a242-e46057f9ca44", "metadata": {}, "source": [ "## Subtracting Two Regions\n", "\n", "Now, let's subtract one region from another. To do this, we use the `subtract()` method of the QMzymeRegion class. This time, we'll consider a case where you want to remove the amino acid residues that form the oxyanion hole in KSI (Tyr 16 and Asp 103, the latter listed under its protonated residue name ASH) to see how this influences coordination of the substrate." ] }, { "cell_type": "code", "execution_count": 17, "id": "37dbd4d5-4f92-47f9-a44e-4ba651ea4cdc", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Charge information not present. QMzyme will try to guess region charges based on residue names consistent with AMBER naming conventions (i.e., aspartate: ASP --> Charge: -1, aspartic acid: ASH --> Charge: 0.). 
See QMzyme.data.residue_charges for the full set.\n" ] } ], "source": [ "# Initialize the model, then set the charge of the nonconventional residue EQU.\n", "model = QMzyme.GenerateModel(PDB)\n", "QMzyme.data.residue_charges.update({'EQU': -1})\n", "\n", "# Define the regions of interest.\n", "model.set_catalytic_center(selection='resname EQU and segid A')\n", "model.set_region(selection=DistanceCutoff, cutoff=3)\n", "model.set_region(selection=\"resid 16 or resid 103\", name=\"oxyanion_hole\")\n", "\n", "# Subtract the oxyanion_hole region from the cutoff_3 region and store the result in the model.\n", "subtracted_region = model.get_region(\"cutoff_3\")\n", "subtracted_region = subtracted_region.subtract(model.oxyanion_hole)\n", "model.set_region(selection=subtracted_region, name=\"subtracted_region\")" ] }, { "cell_type": "markdown", "id": "277355dd-47dc-4658-a6bc-118cbde5fa8d", "metadata": {}, "source": [ "We can inspect a region by passing the output of its `summarize()` method to a pandas DataFrame. We'll first look at the cutoff_3 region, then compare it with subtracted_region." ] }, { "cell_type": "code", "execution_count": 18, "id": "fe813aaa-b687-445d-b329-c53174315a4d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Resid Resname Charge Removed atoms Fixed atoms Segids\n", "0 16 TYR 0 [] [] A\n", "1 20 VAL 0 [] [] A\n", "2 40 ASP -1 [] [] A\n", "3 60 GLY 0 [] [] A\n", "4 61 LEU 0 [] [] A\n", "5 66 VAL 0 [] [] A\n", "6 86 PHE 0 [] [] A\n", "7 88 VAL 0 [] [] A\n", "8 90 MET 0 [] [] A\n", "9 99 LEU 0 [] [] A\n", "10 101 VAL 0 [] [] A\n", "11 103 ASH 0 [] [] A\n", "12 118 ALA 0 [] [] A\n", "13 120 TRP 0 [] [] A\n", "14 263 EQU -1 [] [] A\n", "15 372 WAT 0 [] [] A\n", "16 373 WAT 0 [] [] A\n", "17 376 WAT 0 [] [] A\n", "18 378 WAT 0 [] [] A" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(model.cutoff_3.summarize())\n", "df" ] }, { "cell_type": "code", "execution_count": 19, "id": "775882b7-0683-4e09-878d-66c49b5d8264", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Resid Resname Charge Removed atoms Fixed atoms Segids\n", "0 20 VAL 0 [] [] A\n", "1 40 ASP -1 [] [] A\n", "2 60 GLY 0 [] [] A\n", "3 61 LEU 0 [] [] A\n", "4 66 VAL 0 [] [] A\n", "5 86 PHE 0 [] [] A\n", "6 88 VAL 0 [] [] A\n", "7 90 MET 0 [] [] A\n", "8 99 LEU 0 [] [] A\n", "9 101 VAL 0 [] [] A\n", "10 118 ALA 0 [] [] A\n", "11 120 TRP 0 [] [] A\n", "12 263 EQU -1 [] [] A\n", "13 372 WAT 0 [] [] A\n", "14 373 WAT 0 [] [] A\n", "15 376 WAT 0 [] [] A\n", "16 378 WAT 0 [] [] A" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(model.subtracted_region.summarize())\n", "df" ] }, { "cell_type": "markdown", "id": "be40ab18-a933-4360-8d40-6a09eef37b0b", "metadata": {}, "source": [ "As you can see, Tyr 16 and Asp 103 (ASH) are no longer present in subtracted_region, which has 17 residues instead of the 19 in cutoff_3. The oxyanion_hole region was subtracted successfully!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.0" } }, "nbformat": 4, "nbformat_minor": 5 }